A Review of Large-Scale RDF Document Processing in Hadoop MapReduce Framework

Author

Khushboo Tiwari

Abstract

Resource Description Framework (RDF) is a metadata model that can be used to store metadata about large, complex datasets, from which meaningful information can then be extracted or inferred. Processing such vast amounts of data and reasoning over them is a tedious task. Traditional centralized reasoning methods are not sufficient to process large ontologies. Distributed reasoning methods are thus required to improve the scalability and performance of inference. MapReduce is a widely used parallel computing model that is able to process these huge amounts of data in a short time. Various methods have been introduced to process RDF documents. This paper reviews various methods and techniques used for processing RDF documents.

Keywords: Information retrieval, RDF Document, Semantic Information, XML, MapReduce, Hadoop.
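To make the MapReduce idea mentioned in the abstract concrete, the following is a minimal single-process sketch in Python, not a method taken from the paper. It assumes the RDF data arrives as simple N-Triples-style lines (subject, predicate, object, terminated by a period) and counts how often each predicate occurs. The function names (`map_triple`, `shuffle`, `reduce_counts`, `run_job`) and the `ex:` sample data are hypothetical, and the naive whitespace split is an assumption that would not handle every valid N-Triples line. A real Hadoop job would distribute the map and reduce phases across cluster nodes.

```python
from collections import defaultdict


def map_triple(line):
    """Map step: emit (predicate, 1) for one N-Triples-style line."""
    # Naive parsing (an assumption): drop the trailing period, then split
    # into at most three whitespace-separated fields.
    parts = line.strip().rstrip(".").split(None, 2)
    if len(parts) == 3:
        yield parts[1], 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one predicate."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence within one process."""
    mapped = (pair for line in lines for pair in map_triple(line))
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Hypothetical sample data for illustration only.
SAMPLE = [
    "<ex:alice> <ex:knows> <ex:bob> .",
    "<ex:bob> <ex:knows> <ex:carol> .",
    '<ex:alice> <ex:name> "Alice" .',
]

if __name__ == "__main__":
    print(run_job(SAMPLE))
```

Because each line is mapped independently and each key is reduced independently, both phases can in principle be split across many machines, which is the property the abstract relies on for scalability.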

Similar resources

Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

The Hadoop MapReduce framework is an important distributed processing model for large-scale, data-intensive applications. The rack-aware data placement strategy of the current Hadoop distributed file system assumes a homogeneous Hadoop cluster in which each node has the same computing capacity and is assigned the same workload. Default Hadoop d...

A Scalable RDF Data Processing Framework based on Pig and Hadoop

In order to effectively handle the growing amount of available RDF data, scalable and flexible RDF data processing frameworks are needed. While emerging technologies for Big Data, such as Hadoop-based systems that take advantage of scalable and fault-tolerant distributed processing, based on Google’s distributed file system and MapReduce parallel model, have become available, there are still m...

Distributed RDF Triple Store Using HBase and Hive

The growth of web data has presented new challenges regarding the ability to effectively query RDF data. Traditional relational database systems efficiently scale and query distributed data. With the development of Hadoop and its implementation of the MapReduce framework, along with HBase, a NoSQL data store, the semantics of processing and querying data have changed. Given the existing structure of ...

Survey on Task Assignment Techniques in Hadoop

MapReduce is an implementation for processing large-scale data in parallel. The actual benefits of MapReduce appear when the framework is deployed on a large-scale, shared-nothing cluster. The MapReduce framework abstracts the complexity of running distributed data processing across multiple nodes in a cluster. Hadoop is an open-source implementation of the MapReduce framework, which processes the vast amount o...

Benchmark Hadoop and Mars: MapReduce on cluster versus on GPU

MapReduce[5] is an emerging programming model that utilizes distributed processing elements (PE) on large datasets. With this model, programmers can write highly parallelized code without explicitly dealing with task scheduling and code parallelism in distributed systems. In this paper, we comparatively evaluate the performance of MapReduce model on Hadoop[2] and on Mars[3]. Hadoop is a softwar...

Publication date: 2017
